Enhancing Lempel-Ziv Codes Using an On-Line Variable Length Binary Encoding
Authors
Abstract
The LZW algorithm is the most popular dictionary-based adaptive text compression scheme [1]. In LZW, a changing dictionary holds the common strings encountered so far in the text. The motivation of this research is to explore an on-line variable-length binary encoding. We apply this encoding to LZW codes to remedy the problem discussed in our earlier DCC'95 paper [2], and we call the resulting algorithm LZWAJ. We developed a novel methodology for on-line variable-length binary encoding of a dynamically growing set of integers by mapping the integers to the leaf nodes of a special binary tree called the "phase in binary tree". This tree has many interesting and useful properties [3]. The length of a path from the root node to any leaf node is bounded above by ⌈log_2 n⌉, where n is the number of leaf nodes in the tree, corresponding to the n integers in the set. As a result, the length of each of the n phase in binary codes is at most ⌈log_2 n⌉. An interesting property of this encoding is that the code of an integer i ≤ n can be generated from the unsigned binary representations of i and n alone, without physically constructing the "phase in binary tree" data structure, which makes it very easy to implement. We nevertheless use the "phase in binary tree" in this paper to explain the underlying logic of the encoding. To show the effectiveness of the encoding, we use it to encode the output of LZW, although the same methodology can be applied to any Lempel-Ziv code. The software has been implemented and tested with different kinds of text. The compression performance of the proposed scheme is much better than that of other known methods.

Acknowledgement: We are indebted to Professor James A. Storer for his valuable suggestions towards the benchmarking of the compression performance. We are also thankful to Professor Amar Mukherjee for his insightful comments on this research.
[1] Welch, T., A Technique for High-Performance Data Compression, IEEE Computer, 17(6):8-19, 1984.
[2] Acharya, T. and Mukherjee, A., A Tree-based Binary Encoding of Text Using LZW Algorithm, Data Compression Conference, 1995.
[3] Acharya, T. and JáJá, J., "An On-line Variable Length Binary Encoding", TR: CS-TR-3442, UMIACS-TR-95-39, U. of Maryland at College Park, April 1995.
Also, Department of Electrical Engineering, University of Maryland, College Park, MD 20742.
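The claim that a code can be derived from the binary representations of i and n alone can be illustrated with the standard phased-in (truncated binary) code. The following is a minimal sketch, not the authors' construction: the function names `phase_in_encode` and `phase_in_decode` are hypothetical, integers are assumed to be 0-based (0 ≤ i < n), and the exact assignment of codewords to integers in the paper's "phase in binary tree" may differ. The sketch only shows that prefix-free codes of length at most ⌈log_2 n⌉ can be produced and decoded without building a tree.

```python
def phase_in_encode(i, n):
    """Return a phased-in (truncated binary) codeword for i, 0 <= i < n, as a bit string.

    Assumption: 0-based integers; the paper's own codeword assignment may differ.
    """
    if not 0 <= i < n:
        raise ValueError("i must satisfy 0 <= i < n")
    if n == 1:
        return ""                      # a single possible value needs no bits
    k = (n - 1).bit_length()           # k = ceil(log2 n) for n >= 2
    short = (1 << k) - n               # how many codewords get only k - 1 bits
    if i < short:
        return format(i, "b").zfill(k - 1)
    return format(i + short, "b").zfill(k)


def phase_in_decode(bits, pos, n):
    """Decode one codeword starting at bits[pos]; return (value, next_pos)."""
    if n == 1:
        return 0, pos
    k = (n - 1).bit_length()
    short = (1 << k) - n
    # Read the first k - 1 bits; they already identify every short codeword.
    head = int(bits[pos:pos + k - 1], 2) if k > 1 else 0
    if head < short:
        return head, pos + k - 1
    # Otherwise one more bit completes a k-bit codeword.
    value = (head << 1 | int(bits[pos + k - 1])) - short
    return value, pos + k


if __name__ == "__main__":
    # (value, current set size) pairs: the set grows over time,
    # in the way an LZW dictionary grows while a text is parsed.
    stream = [(0, 2), (2, 3), (1, 4), (4, 5)]
    bits = "".join(phase_in_encode(v, n) for v, n in stream)
    pos, decoded = 0, []
    for _, n in stream:
        v, pos = phase_in_decode(bits, pos, n)
        decoded.append(v)
    print(bits, decoded)
```

Because encoder and decoder both know the current set size n at every step, the code width can grow with the set, which is the situation that arises when encoding LZW dictionary indices.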
Similar resources
An On-line Variable Length
We present a methodology for on-line variable-length binary encoding of a set of integers. The basic principle of this methodology is to maintain the prefix property amongst the codes assigned on-line to a set of integers growing dynamically. The prefix property enables unique decoding of a string of elements from this set. To show the utility of this on-line variable length binary encoding, we...
Faster Compact On-Line Lempel-Ziv Factorization
We present a new on-line algorithm for computing the Lempel-Ziv factorization of a string that runs in O(N log N) time and uses only O(N log σ) bits of working space, where N is the length of the string and σ is the size of the alphabet. This is a notable improvement compared to the performance of previous on-line algorithms using the same order of working space but running in either O(N log^3 N)...
Redundancy of the Lempel-Ziv incremental parsing rule
The Lempel-Ziv codes are universal variable-to-fixed length codes that have become virtually standard in practical lossless data compression. For any given source output string from a Markov or unifilar source, we upper-bound the difference between the number of binary digits needed to encode the string and the self-information of the string. We use this result to demonstrate that for unifilar o...
Universal Variable-to-Fixed Length Codes Achieving Optimum Large Deviations Performance for Empirical Compression Ratio
This paper clarifies two variable-to-fixed length codes which achieve optimum large deviations performance of empirical compression ratio. One is a Lempel-Ziv code with a fixed number of phrases, and the other is an arithmetic code with fixed codeword length. It is shown that the Lempel-Ziv code is asymptotically optimum in the above sense, for the class of finite-alphabet and finite-state sources, and that the ar...
Modelling the EAH Data Compression Algorithm using Graph Theory
Adaptive codes associate variable-length codewords to symbols being encoded depending on the previous symbols in the input data string. This class of codes has been introduced in [9] as a new class of non-standard variable-length codes. New algorithms for data compression, based on adaptive codes of order one, have been presented in [10], where we have behaviorally shown that for a large class ...